NSF PAR Search | NSF Public Access Repository

Breaking the log(1/Δ_2) Barrier: Better Batched Best Arm Identification with Adaptive Grids

Jin, Tianyuan; Zhang, Qin; Zhou, Dongruo (April 2025, International Conference on Learning Representations (ICLR) 2025)

We investigate the problem of batched best arm identification in multi-armed bandits, where we aim to identify the best arm from a set of n arms while minimizing both the number of samples and batches. We introduce an algorithm that achieves near-optimal sample complexity and features an instance-sensitive batch complexity, which breaks the log(1/Δ_2) barrier. The main contribution of our algorithm is a novel sample allocation scheme that effectively balances exploration and exploitation for batch sizes. Experimental results indicate that our approach is more batch-efficient across various setups. We also extend this framework to the problem of batched best arm identification in linear bandits and achieve similar improvements.
Free, publicly-accessible full text available April 1, 2026
Uncertainty-Aware Reward-Free Exploration with General Function Approximation

Zhang, Junkai; Zhang, Weitong; Zhou, Dongruo; Gu, Quanquan (July 2024, International Conference on Machine Learning)

Full Text Available
Risk Bounds of Accelerated SGD for Overparameterized Linear Regression

Li, Xuheng; Deng, Yihe; Wu, Jingfeng; Zhou, Dongruo; Gu, Quanquan (May 2024, International Conference on Learning Representations (ICLR))

Full Text Available
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization

Zhou, Dongruo; Chen, Jinghui; Cao, Yuan; Yang, Ziyan; Gu, Quanquan (March 2024, Transactions on Machine Learning Research)

Full Text Available
Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

He, Jiafan; Zhao, Heyang; Zhou, Dongruo; Gu, Quanquan (January 2023, International Conference on Machine Learning (ICML))

Full Text Available
Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits

Zhao, Heyang; Zhou, Dongruo; He, Jiafan; Gu, Quanquan (January 2023, International Conference on Machine Learning (ICML))

Full Text Available
Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path

Di, Qiwei; He, Jiafan; Zhou, Dongruo; Gu, Quanquan (January 2023, International Conference on Machine Learning (ICML))

Full Text Available
Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency

Zhao, Heyang; He, Jiafan; Zhou, Dongruo; Zhang, Tong; Gu, Quanquan (January 2023, Annual Conference on Learning Theory (COLT))

Full Text Available
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL

Zhang, Weitong; He, Jiafan; Zhou, Dongruo; Zhang, Amy; Gu, Quanquan (January 2023, Conference on Uncertainty in Artificial Intelligence (UAI))

Full Text Available
